90 research outputs found

    Differentiable Meta-Logical Programming

    Deep learning uses an increasing amount of computation and data to solve very specific problems. By stark contrast, human minds solve a wide range of problems using a fixed amount of computation and limited experience. One ability that seems crucial to this kind of general intelligence is meta-reasoning, i.e., our ability to reason about reasoning. To make deep learning do more from less, we propose the differentiable logical meta-interpreter (DLMI). The key idea is to realize a meta-interpreter using differentiable forward-chaining reasoning in first-order logic. This directly allows DLMI to reason and even learn about its own operations. This is different from performing object-level deep reasoning and learning, which refers in some way to entities external to the system. In contrast, DLMI is able to reflect or introspect, i.e., to shift from meta-reasoning to object-level reasoning and vice versa. Among many other experimental evaluations, we illustrate this behavior using the novel task of "repairing Kandinsky patterns," i.e., how to edit the objects in an image so that it agrees with a given logical concept.
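    The forward-chaining idea underlying DLMI can be illustrated with a minimal soft-valued sketch: facts carry truth values in [0, 1], a rule fires with the product of its body values, and a derived atom keeps the maximum over firing rules. This is a hypothetical toy semantics for illustration only; DLMI's actual tensor-based differentiable implementation differs.

    ```python
    def forward_chain(facts, rules, steps=2):
        """Soft forward chaining over ground rules.

        facts: {atom: truth value in [0, 1]}
        rules: list of (head_atom, [body_atoms]) pairs
        """
        v = dict(facts)
        for _ in range(steps):  # repeat so chained rules can propagate
            for head, body in rules:
                fired = 1.0
                for atom in body:
                    fired *= v.get(atom, 0.0)  # product as soft conjunction
                v[head] = max(v.get(head, 0.0), fired)  # max as soft disjunction
        return v

    # Deriving r(a) from p(a) and q(a) with soft truth values:
    values = forward_chain({'p(a)': 0.9, 'q(a)': 0.8},
                           [('r(a)', ['p(a)', 'q(a)'])])
    ```

    Because the product and max are (sub)differentiable, gradients can flow from derived atoms back to the input facts, which is the property a differentiable reasoner exploits.
    
    
    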

    Rebalanced Zero-shot Learning

    Zero-shot learning (ZSL) aims to identify unseen classes with zero samples during training. Broadly speaking, present ZSL methods usually adopt class-level semantic labels and compare them with instance-level semantic predictions to infer unseen classes. However, we find that existing models mostly produce imbalanced semantic predictions, i.e., these models may perform precisely on some semantics but poorly on others. To address this drawback, we introduce an imbalanced learning framework into ZSL. However, imbalanced ZSL poses two unique challenges: (1) its imbalanced predictions are highly correlated with the value of semantic labels rather than with the number of samples, as typically considered in traditional imbalanced learning; (2) different semantics follow quite different error distributions across classes. To mitigate these issues, we first formalize ZSL as an imbalanced regression problem, which offers empirical evidence to interpret how semantic labels lead to imbalanced semantic predictions. We then propose a re-weighted loss termed Re-balanced Mean-Squared Error (ReMSE), which tracks the mean and variance of error distributions, thus ensuring rebalanced learning across classes. As a major contribution, we conduct a series of analyses showing that ReMSE is theoretically well established. Extensive experiments demonstrate that the proposed method effectively alleviates the imbalance in semantic prediction and outperforms many state-of-the-art ZSL methods. Our code is available at https://github.com/FouriYe/ReZSL-TIP23.
    Comment: Accepted to IEEE Transactions on Image Processing (TIP) 2023
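    The re-weighting idea can be sketched generically: weight each semantic dimension's squared error by statistics (mean and spread) of its error distribution, so dimensions with larger or more volatile errors receive more attention. The weighting scheme below is illustrative; the paper's exact ReMSE formulation may differ.

    ```python
    import numpy as np

    def rebalanced_mse(pred, target, eps=1e-8):
        """Illustrative re-weighted MSE over semantic dimensions.

        pred, target: arrays of shape (n_samples, n_semantics)
        """
        err = (pred - target) ** 2        # per-dimension squared errors
        mean_err = err.mean(axis=0)       # mean error per semantic dimension
        std_err = err.std(axis=0)         # spread of errors per dimension
        # Up-weight dimensions with large / volatile errors; normalize so
        # the weights average to 1 across dimensions.
        w = mean_err + std_err + eps
        w = w / w.sum() * err.shape[1]
        return float((err * w).mean())
    ```

    With a perfect prediction the loss is zero, and a dimension with concentrated error is penalized more than under a plain MSE, which is the rebalancing effect the abstract describes.
    
    
    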

    Turn-Level Active Learning for Dialogue State Tracking

    Dialogue state tracking (DST) plays an important role in task-oriented dialogue systems. However, collecting a large amount of turn-by-turn annotated dialogue data is costly and inefficient. In this paper, we propose a novel turn-level active learning framework for DST that actively selects which turns in a dialogue to annotate. Under a limited labelling budget, experimental results demonstrate the effectiveness of selective annotation of dialogue turns. Additionally, our approach achieves DST performance comparable to traditional training approaches with significantly less annotated data, providing a more efficient way to annotate new dialogue data.
    Comment: EMNLP 2023 Main Conference
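    A common way to pick which turns to annotate is an uncertainty criterion: rank turns by the entropy of the model's predicted distribution and label the most uncertain ones first. The sketch below uses this generic entropy-based acquisition as an illustration; the paper's actual acquisition function may differ.

    ```python
    import math

    def select_turns(turn_probs, budget):
        """Pick the `budget` most uncertain turns to annotate.

        turn_probs: {turn_id: [p1, p2, ...]} -- a predicted probability
        distribution per turn (e.g., over candidate slot values).
        """
        def entropy(ps):
            # Shannon entropy; skip zero probabilities to avoid log(0).
            return -sum(p * math.log(p) for p in ps if p > 0)

        ranked = sorted(turn_probs, key=lambda t: entropy(turn_probs[t]),
                        reverse=True)
        return ranked[:budget]
    ```

    A turn with a near-uniform prediction (maximal uncertainty) is selected before a turn the model is already confident about, concentrating the labelling budget where it helps most.
    
    
    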

    Post-marketing safety surveillance of sacituzumab govitecan: an observational, pharmacovigilance study leveraging FAERS database

    Background and objective: Sacituzumab govitecan (SG), the first antibody-drug conjugate targeting human trophoblast cell-surface antigen 2 (Trop-2), has been approved by the Food and Drug Administration (FDA) for the treatment of advanced or metastatic breast cancer and urothelial cancer. However, there is currently a dearth of information regarding the safety profile of SG in a large sample cohort. The objective of the present study is to investigate SG-related adverse events (AEs) in real-world settings, leveraging the FDA Adverse Event Reporting System (FAERS) database to guide the safety management of clinical medication.
    Methods: The FAERS database was retrospectively queried to extract reports associated with SG from April 2020 to March 2023. To identify and evaluate potential AEs in patients receiving SG, several disproportionality analyses were employed: the reporting odds ratio (ROR), the proportional reporting ratio (PRR), the Bayesian confidence propagation neural network (BCPNN), and the multi-item gamma Poisson shrinker (MGPS).
    Results: Overall, 2069 reports with SG as the “primary suspect” were identified. Notably, SG was significantly associated with an increased risk of blood and lymphatic system disorders (ROR, 7.18; 95% CI, 6.58–7.84) and hepatobiliary disorders (ROR, 2.68; 95% CI, 2.17–3.30) at the System Organ Class (SOC) level. Meanwhile, 61 significant disproportionality preferred terms (PTs) that simultaneously complied with all four algorithms were adopted. Among these, anemia, thrombocytopenia, neutropenia, leukopenia, diarrhea, asthenia, alopecia, and electrolyte imbalance were consistent with the common AEs described in the clinical trials and the drug label of SG.
    Furthermore, unexpected significant AEs, including colitis (ROR, 12.09; 95% CI, 9.1–16.08), heart rate increased (ROR, 5.11; 95% CI, 3.84–6.79), sepsis (ROR, 4.77; 95% CI, 3.59–6.34), cholestasis (ROR, 6.28; 95% CI, 3.48–11.36), blood bilirubin increased (ROR, 4.65; 95% CI, 2.42–8.94), and meningitis (ROR, 7.23; 95% CI, 2.71–19.29), were also detected. The median time to onset of SG-related AEs was 14 days [interquartile range (IQR), 7–52], with the majority occurring within the initial month of SG treatment.
    Conclusion: Our study validates the commonly known AEs and also identifies some potentially emerging safety issues related to SG in real-world clinical practice, which could provide valuable vigilance evidence for clinicians and pharmacists in managing the safety issues of SG.
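    The ROR used above is computed from a standard 2×2 contingency table of spontaneous reports, with a signal typically flagged when the lower bound of the 95% CI exceeds 1. A minimal sketch of that computation (with illustrative counts, not FAERS data):

    ```python
    import math

    def reporting_odds_ratio(a, b, c, d):
        """ROR and its 95% CI from a 2x2 report table.

        a: reports with target drug and target AE
        b: reports with target drug and other AEs
        c: reports with other drugs and target AE
        d: reports with other drugs and other AEs
        """
        ror = (a / b) / (c / d)
        # Standard error of ln(ROR) from the four cell counts
        se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
        lower = math.exp(math.log(ror) - 1.96 * se)
        upper = math.exp(math.log(ror) + 1.96 * se)
        return ror, lower, upper

    # Illustrative counts: the AE is four times over-represented for the drug.
    ror, lo, hi = reporting_odds_ratio(20, 80, 100, 1600)
    ```

    The PRR, BCPNN, and MGPS measures mentioned in the abstract are computed from the same 2×2 table with different statistical models.
    
    
    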

    Beyond the Obvious: Evaluating the Reasoning Ability in Real-life Scenarios of Language Models on the Life Scapes Reasoning Benchmark (LSR-Benchmark)

    This paper introduces the Life Scapes Reasoning Benchmark (LSR-Benchmark), a novel dataset targeting real-life scenario reasoning, aiming to close the gap in artificial neural networks' ability to reason in everyday contexts. In contrast to domain-knowledge reasoning datasets, LSR-Benchmark comprises free-text questions rich in information about real-life scenarios, human behaviors, and character roles. The dataset consists of 2,162 questions collected from open-source online sources and manually annotated to improve quality. Experiments are conducted using state-of-the-art language models, such as gpt3.5-turbo and instruction fine-tuned LLaMA models, to test their performance on LSR-Benchmark. The results reveal that humans outperform these models significantly, indicating a persisting challenge for machine learning models in comprehending daily human life.

    Xiezhi: An Ever-Updating Benchmark for Holistic Domain Knowledge Evaluation

    New Natural Language Processing (NLP) benchmarks are urgently needed to keep pace with the rapid development of large language models (LLMs). We present Xiezhi, the most comprehensive evaluation suite designed to assess holistic domain knowledge. Xiezhi comprises 220,000 multiple-choice questions spanning 516 diverse disciplines across 13 subjects, accompanied by Xiezhi-Specialty and Xiezhi-Interdiscipline, each with 15k questions. We evaluate 47 cutting-edge LLMs on Xiezhi. Results indicate that LLMs exceed the average performance of humans in science, engineering, agronomy, medicine, and art, but fall short in economics, jurisprudence, pedagogy, literature, history, and management. We anticipate Xiezhi will help analyze important strengths and shortcomings of LLMs, and the benchmark is released at https://github.com/MikeGu721/XiezhiBenchmark .
    Comment: Under review at NeurIPS 202